Reinforcement Learning with Soft

نویسندگان

Satinder P. Singh

Tommi Jaakkola

Michael I. Jordan

چکیده

It is widely accepted that the use of more compact representations than lookup tables is crucial to scaling reinforcement learning (RL) algorithms to real-world problems. Unfortunately almost all of the theory of reinforcement learning assumes lookup table representations. In this paper we address the pressing issue of combining function approximation and RL, and present 1) a function approx-imator based on a simple extension to state aggregation (a commonly used form of compact representation), namely soft state aggregation, 2) a theory of convergence for RL with arbitrary, but xed, soft state aggregation, 3) a novel intuitive understanding of the eeect of state aggregation on online RL, and 4) a new heuristic adaptive state aggregation algorithm that nds improved compact representations by exploiting the non-discrete nature of soft state aggregation. Preliminary empirical results are also presented.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Composable Deep Reinforcement Learning for Robotic Manipulation

Model-free deep reinforcement learning has been shown to exhibit good performance in domains ranging from video games to simulated robotic manipulation and locomotion. However, model-free methods are known to perform poorly when the interaction time with the environment is limited, as is the case for most real-world robotic tasks. In this paper, we study how maximum entropy policies trained usi...

متن کامل

Reinforcement learning in complementarity game and population dynamics.

We systematically test and compare different reinforcement learning schemes in a complementarity game [J. Jost and W. Li, Physica A 345, 245 (2005)] played between members of two populations. More precisely, we study the Roth-Erev, Bush-Mosteller, and SoftMax reinforcement learning schemes. A modified version of Roth-Erev with a power exponent of 1.5, as opposed to 1 in the standard version, pe...

متن کامل

Dynamic adaptation of quantization thresholds for soft-decision viterbi decoding with a reinforcement learning neural network

Two reinforcement learning neural network architectures which enhance the performance of a soft-decision Viterbi decoder used for forward error-correction in a digital communication system have been investigated and compared. Each reinforcement learning neural network is designed to work as a co-processor to a demodulator dynamically adapting the soft quantization thresholds toward optimal sett...

متن کامل

Reinforcement Learning in Neural Networks: A Survey

In recent years, researches on reinforcement learning (RL) have focused on bridging the gap between adaptive optimal control and bio-inspired learning techniques. Neural network reinforcement learning (NNRL) is among the most popular algorithms in the RL framework. The advantage of using neural networks enables the RL to search for optimal policies more efficiently in several real-life applicat...

متن کامل